Campaign 2016 (PhD Chapter 1)


This series of files compiles all analyses done for Chapter 1 on the 2016 regional campaign.

All analyses were done with PRIMER-e 6 and R 3.6.2.



We used data from subtidal ecosystems (see metadata files for more information). Only stations sampled for both abiotic parameters and benthic species were included.

Selected variables for the analyses:

Abundances of Mesodesma arctatum (Marc) and Cistenides granulata (Cgra) were also considered (see IndVal and SIMPER results).

As data is missing for metal concentrations outside BSI, two Designs have been used:


1. Data manipulation

For the following analyses, the independent variables are habitat parameters and heavy metal concentrations, and the dependent variables are diversity indices.

1.1. Identification of outliers

To identify stations that are inconsistent with the others, we computed the multivariate Cook’s Distance (CD) on the uncorrelated variables. Stations with a CD greater than 4 times the mean CD were flagged as outliers.

Design 1

Based on Cook’s Distance, we identified stations 60, 72, 80 and 96 as general outliers. They were excluded from the subsequent analyses of Design 1.

Design 2

Based on Cook’s Distance, we identified stations 108 and 110 as general outliers. They were excluded from the subsequent analyses of Design 2.
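
The outlier rule described in this section can be sketched in R as follows. This is only an illustration on simulated data: it uses the univariate Cook's distance of a single linear model (`cooks.distance()` from base R), not the multivariate version applied in the report, and the data frame `stations` and its columns are hypothetical.

```r
# Hypothetical data: 50 stations, one response and two predictors
set.seed(1)
stations <- data.frame(
  id   = 1:50,
  sand = runif(50),
  silt = runif(50)
)
stations$S <- 5 + 3 * stations$sand + 2 * stations$silt + rnorm(50, sd = 0.5)

# Make one station clearly inconsistent with the others
stations$S[10] <- 40

fit <- lm(S ~ sand + silt, data = stations)
cd  <- cooks.distance(fit)

# Flag stations whose Cook's distance exceeds 4 times the mean
threshold <- 4 * mean(cd)
outliers  <- stations$id[cd > threshold]

# Exclude the flagged stations before any further analysis
stations_clean <- stations[!stations$id %in% outliers, ]
print(outliers)
```

Because the threshold is relative to the mean CD, it adapts to each design; this is why the two designs can flag different stations.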

1.2. Correlations between parameters

Correlations have been calculated with Spearman’s rank coefficient.

Design 1

Correlation coefficients between habitat parameters (Design 1)
  om gravel sand silt clay
om 1 -0.068 -0.807 0.714 0.706
gravel -0.068 1 -0.192 -0.37 -0.329
sand -0.807 -0.192 1 -0.772 -0.768
silt 0.714 -0.37 -0.772 1 0.973
clay 0.706 -0.329 -0.768 0.973 1

According to these results, the following variables are highly correlated (\(|\rho|\) > 0.80) so they have been considered together in the regressions of Design 1:

  • silt and clay (clay deleted)

We decided to keep sand, even though it is correlated with om, to stay consistent with the 2014 campaign.

Design 2

Correlation coefficients between heavy metals concentrations (Design 2)
  arsenic cadmium chromium copper iron manganese mercury lead zinc
arsenic 1 0.492 0.736 0.876 0.773 0.399 0.646 0.816 0.903
cadmium 0.492 1 0.757 0.41 0.766 0.881 0.154 0.708 0.663
chromium 0.736 0.757 1 0.712 0.825 0.767 0.463 0.85 0.879
copper 0.876 0.41 0.712 1 0.633 0.38 0.572 0.829 0.89
iron 0.773 0.766 0.825 0.633 1 0.755 0.429 0.745 0.842
manganese 0.399 0.881 0.767 0.38 0.755 1 0.105 0.584 0.628
mercury 0.646 0.154 0.463 0.572 0.429 0.105 1 0.627 0.545
lead 0.816 0.708 0.85 0.829 0.745 0.584 0.627 1 0.898
zinc 0.903 0.663 0.879 0.89 0.842 0.628 0.545 0.898 1

According to these results, the following variables are highly correlated (\(|\rho|\) > 0.80) so they have been considered together in the regressions of Design 2:

  • cadmium and manganese (manganese deleted)
  • copper, lead and zinc (copper and zinc deleted)

We decided to keep arsenic, even though it is correlated with the copper/lead/zinc group, to stay consistent with the 2014 campaign.
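
As an illustration of the screening rule, the sketch below re-enters the Design 1 coefficients from the table above and lists the pairs with \(|\rho|\) > 0.80. The object names (`rho`, `high`) are hypothetical; on the raw data the matrix would presumably be obtained with `cor(..., method = "spearman")`.

```r
# Spearman coefficients copied from the Design 1 table
vars <- c("om", "gravel", "sand", "silt", "clay")
rho <- matrix(
  c( 1,     -0.068, -0.807,  0.714,  0.706,
    -0.068,  1,     -0.192, -0.37,  -0.329,
    -0.807, -0.192,  1,     -0.772, -0.768,
     0.714, -0.37,  -0.772,  1,      0.973,
     0.706, -0.329, -0.768,  0.973,  1),
  nrow = 5, byrow = TRUE, dimnames = list(vars, vars)
)

# Keep each pair once (upper triangle) and apply the |rho| > 0.80 threshold
idx <- which(abs(rho) > 0.80 & upper.tri(rho), arr.ind = TRUE)
high <- data.frame(
  var1 = rownames(rho)[idx[, "row"]],
  var2 = colnames(rho)[idx[, "col"]],
  rho  = rho[idx]
)
print(high)
```

The two pairs returned (om/sand and silt/clay) match the text: clay was dropped, while sand was deliberately kept despite its correlation with om.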

2. Permutational Analyses of Covariance

Results of the univariate PermANCOVAs on each parameter and of the multivariate PermANCOVA on the whole benthic community, with depth as a covariate, are presented in the table below. Variables were normalized and abundances were log(x + 1) transformed.

Variable Condition Region(Co) Depth Groups of statistically similar regions (p > 0.05)
om S S {CPC BDA MR}
gravel All regions in the same group
sand S All regions in the same group
silt S S {BSI CPC BDA}, {BDA MR}
clay {BSI BDA MR}, {CPC MR}
S (1 mm) S {BSI CPC MR}, {CPC BDA MR}
N (1 mm) All regions in the same group
H (1 mm) s~ S {CPC BDA MR}, {BSI MR}
J (1 mm) {BSI CPC MR}, {CPC BDA MR}
ALL SPECIES (1 mm) S S

3. Similarity and characteristic species

Let’s have a look at the \(\beta\) diversity within our conditions and sites.

Results of the PERMDISP routine are shown below (mean and SE of the deviation from centroid for each group, i.e. multivariate dispersion), along with the mean Bray-Curtis dissimilarity for each group. Abundances were log(x + 1) transformed, and PERMDISP was run in PRIMER.

Mean within-group Bray-Curtis dissimilarity for each condition or site
  Mean deviation SE of deviation Mean BC dissimilarity
HI 64.6 0.83 0.917
R 61.9 1.14 0.878
BSI 62.9 1.18 0.903
CPC 60.2 2.25 0.87
BDA 61.1 1.93 0.882
MR 58.2 2.12 0.835

No significant differences in dispersion were detected for either factor by PERMDISP (p = 0.069) or by the pairwise tests.
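
The "Mean BC dissimilarity" column can be illustrated with a short base R sketch. It is not the PERMDISP routine itself (which was run in PRIMER); it only shows how a mean within-group Bray-Curtis dissimilarity on log(x + 1) abundances might be computed. The abundance matrix and the group labels are made up.

```r
# Hypothetical abundance matrix: 6 stations (rows) x 4 species (columns)
abund <- matrix(
  c(10, 0, 3, 1,
     8, 1, 0, 2,
     0, 5, 7, 0,
     1, 4, 9, 0,
    12, 0, 2, 2,
     0, 6, 6, 1),
  nrow = 6, byrow = TRUE
)
group <- c("HI", "HI", "R", "R", "HI", "R")

x <- log1p(abund)  # log(x + 1) transformation

# Bray-Curtis dissimilarity between two samples
bray_curtis <- function(a, b) sum(abs(a - b)) / sum(a + b)

# Mean pairwise dissimilarity among the stations of one group
mean_within <- function(rows) {
  pairs <- combn(rows, 2)
  mean(apply(pairs, 2, function(p) bray_curtis(x[p[1], ], x[p[2], ])))
}

within_bc <- sapply(split(seq_len(nrow(x)), group), mean_within)
print(round(within_bc, 3))
```

Values range from 0 (identical composition) to 1 (no shared species), so the values of 0.84 to 0.92 in the table indicate strongly heterogeneous stations within every group.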

The following analyses (IndVal and SIMPER) identify the species that are characteristic of each condition. Results from PRIMER were used to further support their selection.

##                       cluster indicator_value probability
## cistenides_granulata        1          0.2836       0.018
## macoma_calcarea             1          0.2326       0.002
## ennucula_tenuis             1          0.1860       0.018
## eudorellopsis_integra       1          0.1395       0.029
## mesodesma_arctatum          2          0.2342       0.007
## harmothoe_imbricata         2          0.1975       0.010
## glycera_alba                2          0.1212       0.039
## psammonyx_nobilis           2          0.1212       0.029
## 
## Sum of probabilities                 =  50.871 
## 
## Sum of Indicator Values              =  5.89 
## 
## Sum of Significant Indicator Values  =  1.52 
## 
## Number of Significant Indicators     =  8 
## 
## Significant Indicator Distribution
## 
## 1 2 
## 4 4
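
For readers unfamiliar with the indicator value, the sketch below computes it for a single species following the usual Dufrêne-Legendre definition (specificity multiplied by fidelity). The data are invented and the function `indval_one()` is a hypothetical helper, not the routine that produced the output above; the permutation test behind the `probability` column is not reproduced.

```r
# Hypothetical abundances of one species at 8 stations in two clusters
abund   <- c(5, 3, 0, 4, 0, 0, 1, 0)
cluster <- c(1, 1, 1, 1, 2, 2, 2, 2)

indval_one <- function(abund, cluster) {
  mean_ab <- tapply(abund, cluster, mean)  # mean abundance per cluster
  A <- mean_ab / sum(mean_ab)              # specificity
  B <- tapply(abund > 0, cluster, mean)    # fidelity (frequency of occurrence)
  A * B                                    # indicator value per cluster
}

iv <- indval_one(abund, cluster)
print(round(iv, 4))
```

A species reaches the maximum value of 1 only if it occurs at every station of one cluster and nowhere else, which explains why the significant values reported above remain fairly low.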
SIMPER results (mean Bray-Curtis between-group dissimilarity: 0.926)
  average sd ratio ava avb cumsum
echinarachnius_parma 0.0984 0.136 0.721 0.689 0.42 0.106
mesodesma_arctatum 0.07 0.129 0.542 0.605 0.0995 0.182
cistenides_granulata 0.0609 0.0948 0.643 0.176 0.565 0.248
strongylocentrotus_sp 0.0427 0.0758 0.563 0.27 0.249 0.294
nephtys_caeca 0.0425 0.0556 0.764 0.359 0.23 0.34
limecola_balthica 0.0313 0.0578 0.542 0.234 0.18 0.373
scoloplos_armiger 0.0295 0.065 0.453 0.14 0.256 0.405
macoma_calcarea 0.0274 0.0569 0.482 0 0.312 0.435
harmothoe_imbricata 0.0257 0.0583 0.44 0.217 0.0161 0.462
amphipholis_squamata 0.0238 0.0611 0.389 0.042 0.241 0.488
protomedeia_grandimana 0.0228 0.0538 0.424 0.183 0.169 0.513
psammonyx_nobilis 0.0189 0.0592 0.32 0.185 0 0.533
thyasira_sp 0.0186 0.0469 0.397 0.021 0.241 0.553
ennucula_tenuis 0.0185 0.0422 0.438 0 0.241 0.573
mya_arenaria 0.0174 0.034 0.513 0.063 0.168 0.592
ciliatocardium_ciliatum 0.014 0.045 0.312 0.0908 0.0766 0.607
goniada_maculata 0.0139 0.0354 0.391 0.021 0.173 0.622
glycera_dibranchiata 0.0134 0.043 0.31 0.021 0.0806 0.637
glycera_alba 0.0128 0.0408 0.313 0.172 0 0.65
ameritella_agilis 0.0117 0.0491 0.238 0 0.131 0.663
astarte_undata 0.0117 0.0388 0.301 0.142 0 0.676
astarte_subaequilatera 0.0106 0.0363 0.293 0.134 0 0.687
nucula_proxima 0.00992 0.0349 0.284 0 0.112 0.698
pygospio_elegans 0.00989 0.0449 0.22 0.137 0.0161 0.708
ophelia_limacina 0.00977 0.0299 0.327 0.042 0.0578 0.719
diastylis_sculpta 0.00966 0.0405 0.238 0.0488 0.0322 0.729
eudorellopsis_integra 0.00955 0.0267 0.358 0 0.153 0.74
ampharetidae_spp 0.00948 0.0277 0.342 0.0753 0.0535 0.75
yoldia_myalis 0.00913 0.0285 0.321 0.0543 0.0484 0.76
nephtys_bucera 0.00905 0.0256 0.354 0.063 0.0322 0.77
ampeliscidae_spp 0.00898 0.0253 0.354 0.063 0.0511 0.779
pontoporeia_femorata 0.00877 0.0404 0.217 0 0.132 0.789
bipalponephtys_neotena 0.00836 0.037 0.226 0 0.106 0.798
maldanidae_spp 0.00825 0.0272 0.303 0.0908 0.0322 0.807
pagurus_pubescens 0.00766 0.0231 0.331 0.0753 0.0161 0.815
polynoidae_spp 0.00756 0.0217 0.349 0.021 0.0952 0.823
ampharete_oculata 0.00725 0.0439 0.165 0.0666 0 0.831
phyllodoce_mucosa 0.00643 0.0241 0.267 0 0.106 0.838
phyllodocidae_spp 0.00629 0.0211 0.298 0.021 0.0484 0.845
phoxocephalus_holbolli 0.00621 0.0329 0.189 0 0.0827 0.851
testudinalia_testudinalis 0.00576 0.026 0.222 0.08 0 0.858
harpinia_propinqua 0.00547 0.0253 0.216 0.0753 0.0161 0.864
quasimelita_formosa 0.00486 0.0192 0.253 0 0.0739 0.869
nephtys_ciliata 0.00455 0.0213 0.214 0 0.0645 0.874
platyhelminthes 0.00429 0.0164 0.262 0 0.0484 0.878
lacuna_vincta 0.00427 0.0233 0.184 0 0.0417 0.883
cancer_irroratus 0.00405 0.0143 0.283 0.042 0.0161 0.887
nephtys_incisa 0.00399 0.0185 0.216 0.021 0.0161 0.892
arrhoges_occidentalis 0.00398 0.0167 0.239 0.0543 0 0.896
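
The `cumsum` column appears to be the running sum of the `average` contributions divided by the overall between-group dissimilarity (0.926). The following check, using the first five rows of the table, reproduces the reported values to three decimals; the object names are hypothetical.

```r
# Average contributions of the first five species in the SIMPER table
average <- c(echinarachnius_parma  = 0.0984,
             mesodesma_arctatum    = 0.07,
             cistenides_granulata  = 0.0609,
             strongylocentrotus_sp = 0.0427,
             nephtys_caeca         = 0.0425)

overall <- 0.926  # mean between-group Bray-Curtis dissimilarity

# Cumulative share of the overall dissimilarity
cumsum_check <- cumsum(average) / overall
print(round(cumsum_check, 3))
```

Similarly, `ratio` matches `average / sd` up to the rounding of the printed values (for example 0.07 / 0.129 is about 0.54 for mesodesma_arctatum).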

4. Univariate regressions

We used linear models for all regressions on diversity indices. Outliers and correlated variables were removed from these analyses.

4.1. Simple regressions

These analyses were done to explore the relationships between variables. Because they produce a large number of results to interpret, only the multiple regressions will be included in the article (see below).

Design 1

Adjusted R-squared of simple regressions for Design 1
  om gravel sand silt
S 0.09824 0.06215 0.0708 0.1258
N 0.01242 0.01491 0.03477 0.03467
H 0.09519 0.03329 0.06053 0.1134
J 0.004809 -0.0122 0.01178 0.01984
p-values of simple regressions for Design 1
  om gravel sand silt
S 0.00425 0.01962 0.01359 0.001309
N 0.1732 0.1542 0.06343 0.06371
H 0.004839 0.06765 0.02101 0.002229
J 0.2504 0.7054 0.1785 0.123

Design 2

Adjusted R-squared of simple regressions for Design 2
  arsenic cadmium chromium iron mercury lead
S -0.01268 -0.04896 -0.03331 -0.04823 -0.047 0.06622
N 0.008407 -0.04909 -0.03615 -0.04682 -0.04877 0.03425
H -0.01205 -0.03027 -0.001362 -0.02749 -0.02325 0.102
J -0.04952 -0.01768 -0.0304 -0.03285 -0.03656 -0.04851
p-values of simple regressions for Design 2
  arsenic cadmium chromium iron mercury lead
S 0.4008 0.8897 0.5762 0.8559 0.8132 0.1303
N 0.2907 0.8964 0.6107 0.8078 0.8796 0.2014
H 0.3967 0.543 0.3361 0.5155 0.478 0.08065
J 0.9251 0.4348 0.5443 0.5708 0.6162 0.8677
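
Tables of this kind can be produced by looping over every response and predictor pair. The sketch below does so on simulated data; the data frame `dat` and its values are hypothetical, and only the layout (responses in rows, predictors in columns) mirrors the tables above.

```r
set.seed(42)
# Hypothetical data: four habitat parameters and two diversity indices
dat <- data.frame(om = runif(40), gravel = runif(40),
                  sand = runif(40), silt = runif(40))
dat$S <- 5 + 4 * dat$silt + rnorm(40)
dat$H <- 1 + 0.5 * dat$om + rnorm(40, sd = 0.3)

responses  <- c("S", "H")
predictors <- c("om", "gravel", "sand", "silt")

adj_r2 <- p_val <- matrix(NA_real_, length(responses), length(predictors),
                          dimnames = list(responses, predictors))

for (y in responses) {
  for (x in predictors) {
    # One simple regression per response/predictor pair
    s <- summary(lm(reformulate(x, response = y), data = dat))
    adj_r2[y, x] <- s$adj.r.squared
    p_val[y, x]  <- s$coefficients[x, "Pr(>|t|)"]
  }
}

print(signif(adj_r2, 4))
print(signif(p_val, 4))
```

Adjusted R-squared can be negative when a predictor explains less variance than expected by chance, which is why several Design 2 entries fall below zero.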

Furthermore, depth was shown to be important for several parameters in the PermANCOVAs, so the corresponding scatterplots are presented here.

4.2. Multiple regressions

This section presents analyses done to determine (i) which model (Design 1 or Design 2) best describes the parameters and (ii) which variables are most important in explaining them.

4.2.1. Best model selection

This step was not used here as both models were needed.

4.2.2. Selection of significant variables

We used an AIC-based selection procedure to identify the variables that best predict each parameter. The results of the variable selection are shown in the tables below:

  • for the model of Design 1
Variable (or combination)    S      N      H      J
om
gravel                              -      +
sand                         +      -      +
silt/clay                    +      -      +      +
Adjusted \(R^{2}\)           0.17   0.1    0.18   0.02
  • for the model of Design 2
Variable (or combination)    S      N      H      J
arsenic
cadmium/manganese
chromium                     -      -      -
iron
mercury
lead/copper/zinc             +      +      +
Adjusted \(R^{2}\)           0.29   0.16   0.21   0
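
An AIC-based selection of this kind can be sketched with `step()` from base R. The report does not state the direction of the search, so backward elimination is an assumption here, and the data are simulated.

```r
set.seed(7)
# Hypothetical data in which only sand and silt influence richness
dat <- data.frame(om = runif(60), gravel = runif(60),
                  sand = runif(60), silt = runif(60))
dat$S <- 2 + 6 * dat$sand + 8 * dat$silt + rnorm(60)

full <- lm(S ~ om + gravel + sand + silt, data = dat)

# Backward elimination: drop terms as long as the AIC decreases
reduced <- step(full, direction = "backward", trace = 0)

print(formula(reduced))
print(sign(coef(reduced)[-1]))          # +1 / -1, as in the summary tables
print(summary(reduced)$adj.r.squared)
```

The sign of each retained coefficient corresponds to the + and - entries of the summary tables, and an empty cell means that the variable was dropped.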

Details of the regressions, with diagnostics and cross-validation, are summarized below.
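
The two diagnostics reported for each model, variance inflation factors and a cross-validated RMSE, can be sketched in base R as follows. The helper functions `vif_manual()` and `cv_rmse()` are hypothetical, the data are simulated, and the 5-fold scheme is an assumption because the report does not specify the type of cross-validation.

```r
set.seed(3)
n <- 60
dat <- data.frame(sand = runif(n))
dat$silt <- 1 - dat$sand + rnorm(n, sd = 0.15)  # deliberately correlated with sand
dat$S    <- 2 + 6 * dat$sand + 8 * dat$silt + rnorm(n)

# VIF of a predictor: 1 / (1 - R^2) from regressing it on the other predictors
vif_manual <- function(predictors, data) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    r2 <- summary(lm(reformulate(others, response = p), data = data))$r.squared
    1 / (1 - r2)
  })
}
v <- vif_manual(c("sand", "silt"), dat)

# k-fold cross-validated root mean squared error
cv_rmse <- function(formula, data, k = 5) {
  fold <- sample(rep(seq_len(k), length.out = nrow(data)))
  y    <- all.vars(formula)[1]
  sq_err <- unlist(lapply(seq_len(k), function(i) {
    fit <- lm(formula, data = data[fold != i, ])
    (data[fold == i, y] - predict(fit, newdata = data[fold == i, ]))^2
  }))
  sqrt(mean(sq_err))
}
rmse <- cv_rmse(S ~ sand + silt, dat)

print(round(v, 2))
print(rmse)
```

With exactly two predictors both VIFs are necessarily equal, which is the pattern seen in the two-variable reduced models below (3.55 for sand and silt, 2.7 for chromium and lead).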

Design 1

Species richness
## FULL MODEL
## Adjusted R2 is: 0.15
Fitting linear model: S ~ om + gravel + sand + silt
  Estimate Std. Error t value Pr(>|t|)
(Intercept) -5.815 7.696 -0.7556 0.4526
om -0.1275 0.8173 -0.1561 0.8765
gravel 3.618 8.918 0.4057 0.6863
sand 10.11 7.78 1.3 0.1982
silt 15.05 9.997 1.505 0.137
## FULL MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 2.60029
Variance Inflation Factors
  om gravel sand silt
VIF 2.01 2.35 8.23 9.4

## REDUCED MODEL
## Adjusted R2 is: 0.17
Fitting linear model: S ~ sand + silt
  Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.998 3.167 -0.9466 0.3471
sand 7.299 3.315 2.202 0.03102 *
silt 11.48 3.727 3.081 0.002963 * *
## REDUCED MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 2.515746
Variance Inflation Factors
  sand silt
VIF 3.55 3.55

Total abundance
## FULL MODEL
## Adjusted R2 is: 0.1
Fitting linear model: N ~ om + gravel + sand + silt
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 215.1 74.6 2.883 0.005295 * *
om 8.721 7.923 1.101 0.2749
gravel -244.9 86.46 -2.833 0.006085 * *
sand -199.8 75.42 -2.649 0.01006 *
silt -231.5 96.92 -2.389 0.01974 *
## FULL MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 30.52945
Variance Inflation Factors
  om gravel sand silt
VIF 2.01 2.35 8.23 9.4

## REDUCED MODEL
## Adjusted R2 is: 0.1
Fitting linear model: N ~ gravel + sand + silt
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 176.7 66.07 2.674 0.009366 * *
gravel -201.2 76.9 -2.616 0.01094 *
sand -159.3 65.94 -2.416 0.0184 *
silt -166 76.63 -2.166 0.03379 *
## REDUCED MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 29.91383
Variance Inflation Factors
  gravel sand silt
VIF 2.09 7.18 7.42

Shannon index
## FULL MODEL
## Adjusted R2 is: 0.17
Fitting linear model: H ~ om + gravel + sand + silt
  Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.954 1.61 -1.834 0.07105
om -0.1016 0.171 -0.5938 0.5546
gravel 3.041 1.866 1.63 0.1079
sand 3.949 1.628 2.426 0.01798 *
silt 5.424 2.092 2.593 0.01168 *
## FULL MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 0.5255413
Variance Inflation Factors
  om gravel sand silt
VIF 2.01 2.35 8.23 9.4

## REDUCED MODEL
## Adjusted R2 is: 0.18
Fitting linear model: H ~ gravel + sand + silt
  Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.507 1.417 -1.769 0.08133
gravel 2.532 1.649 1.535 0.1295
sand 3.477 1.414 2.459 0.01649 *
silt 4.662 1.644 2.836 0.006011 * *
## REDUCED MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 0.5269319
Variance Inflation Factors
  gravel sand silt
VIF 2.09 7.18 7.42

Pielou’s evenness
## FULL MODEL
## Adjusted R2 is: 0
Fitting linear model: J ~ om + gravel + sand + silt
  Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.3581 0.8682 -0.4124 0.6814
om -0.07058 0.0922 -0.7655 0.4467
gravel 1.11 1.006 1.104 0.2737
sand 1.07 0.8778 1.219 0.2273
silt 1.569 1.128 1.391 0.1688
## FULL MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 0.3016641
Variance Inflation Factors
  om gravel sand silt
VIF 2.01 2.35 8.23 9.4

## REDUCED MODEL
## Adjusted R2 is: 0.02
Fitting linear model: J ~ silt
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.6944 0.0404 17.19 8e-27 * * *
silt 0.1853 0.1187 1.561 0.123
## REDUCED MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 0.2855537
Variance Inflation Factors
  silt
VIF 1

Design 2

Species richness
## FULL MODEL
## Adjusted R2 is: 0.23
Fitting linear model: S ~ arsenic + cadmium + chromium + iron + mercury + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 8.292 2.311 3.587 0.002696 * *
arsenic -0.06374 0.2517 -0.2532 0.8035
cadmium -4.257 22.55 -0.1888 0.8528
chromium -0.1487 0.1001 -1.486 0.1581
iron -8.05e-05 0.0001006 -0.8002 0.4361
mercury -52.39 38.5 -1.361 0.1937
lead 2.059 0.6635 3.103 0.007277 * *
## FULL MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 3.108151
Variance Inflation Factors
  arsenic cadmium chromium iron mercury lead
VIF 2.19 1.86 3.63 2.85 1.21 3.25

## REDUCED MODEL
## Adjusted R2 is: 0.29
Fitting linear model: S ~ chromium + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.618 1.478 4.479 0.0002574 * * *
chromium -0.1919 0.07173 -2.675 0.01499 *
lead 1.677 0.532 3.153 0.005237 * *
## REDUCED MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 2.508174
Variance Inflation Factors
  chromium lead
VIF 2.7 2.7

Total abundance
## FULL MODEL
## Adjusted R2 is: 0.04
Fitting linear model: N ~ arsenic + cadmium + chromium + iron + mercury + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 34.8 15.49 2.247 0.04011 *
arsenic 0.7435 1.686 0.4409 0.6656
cadmium 19.5 151.1 0.129 0.8991
chromium -0.7334 0.6704 -1.094 0.2912
iron -0.0005478 0.000674 -0.8128 0.429
mercury -236.2 258 -0.9155 0.3744
lead 8.931 4.446 2.009 0.06288
## FULL MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 24.0044
Variance Inflation Factors
  arsenic cadmium chromium iron mercury lead
VIF 2.19 1.86 3.63 2.85 1.21 3.25

## REDUCED MODEL
## Adjusted R2 is: 0.16
Fitting linear model: N ~ chromium + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 24.22 9.576 2.529 0.02046 *
chromium -0.9287 0.4648 -1.998 0.06024
lead 8.207 3.447 2.381 0.0279 *
## REDUCED MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 16.71375
Variance Inflation Factors
  chromium lead
VIF 2.7 2.7

Shannon index
## FULL MODEL
## Adjusted R2 is: 0.06
Fitting linear model: H ~ arsenic + cadmium + chromium + iron + mercury + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.35 0.431 3.133 0.006841 * *
arsenic -0.02749 0.04693 -0.5857 0.5668
cadmium -0.5051 4.206 -0.1201 0.906
chromium -0.02101 0.01866 -1.126 0.2778
iron -4.814e-06 1.876e-05 -0.2566 0.801
mercury -4.576 7.18 -0.6373 0.5335
lead 0.2943 0.1237 2.379 0.03107 *
## FULL MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 0.454785
Variance Inflation Factors
  arsenic cadmium chromium iron mercury lead
VIF 2.19 1.86 3.63 2.85 1.21 3.25

## REDUCED MODEL
## Adjusted R2 is: 0.21
Fitting linear model: H ~ chromium + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.296 0.2619 4.949 8.918e-05 * * *
chromium -0.02438 0.01271 -1.918 0.07024
lead 0.2364 0.09427 2.508 0.02137 *
## REDUCED MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 0.4430522
Variance Inflation Factors
  chromium lead
VIF 2.7 2.7

Pielou’s evenness
## FULL MODEL
## Adjusted R2 is: -0.23
Fitting linear model: J ~ arsenic + cadmium + chromium + iron + mercury + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.5733 0.2064 2.778 0.01407 *
arsenic -0.006316 0.02247 -0.2811 0.7825
cadmium 0.4514 2.013 0.2242 0.8256
chromium 0.006524 0.008932 0.7304 0.4764
iron 2.023e-06 8.981e-06 0.2252 0.8248
mercury 2.501 3.437 0.7275 0.4781
lead -0.05506 0.05924 -0.9295 0.3674
## FULL MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 0.2719167
Variance Inflation Factors
  arsenic cadmium chromium iron mercury lead
VIF 2.19 1.86 3.63 2.85 1.21 3.25

## REDUCED MODEL
## Adjusted R2 is: 0
Fitting linear model: J ~ 1
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.7955 0.04487 17.73 4.118e-14 * * *
## REDUCED MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 0.2163708

Quitting from lines 419-420 (C1_analyses_16B.Rmd) Error in Qr$qr[p1, p1, drop = FALSE] : subscript out of bounds In addition: There were 26 warnings (use warnings() to see them)

5. Multivariate regressions

The independent variables are habitat parameters or heavy metal concentrations, and the dependent variables are species abundances. Outliers and correlated variables were excluded from the analysis.

This analysis was done in PRIMER, using a DistLM to identify the variables that explain the most community variability and a dbRDA to plot the results.

Design 1

The variables selected by the DistLM procedure give an \(R^{2}\) of 0.08.

Design 2

The variables selected by the DistLM procedure give an \(R^{2}\) of 0.27.


Elliot Dreujou

2020-02-11